Overview of Statistical Analysis and Modeling in R
Final Exam: Statistical Analysis and Modeling in R
Statistical Analysis and Modeling in R: Building Regularized Models & Ensemble Models
Statistical Analysis and Modeling in R: Performing Classification
Statistical Analysis and Modeling in R: Performing Clustering
Statistical Analysis and Modeling in R: Performing Regression Analysis
Statistical Analysis and Modeling in R: Statistical Analysis on Your Data
Statistical Analysis and Modeling in R: Understanding & Interpreting Statistical Tests
Statistical Analysis and Modeling in R: Working with Probability Distributions

Final Exam: Statistical Analysis and Modeling in R

Course Number:
it_fedawr_04_enus
Lesson Objectives

  • analyze data that follows a uniform distribution
  • check the assumptions of the paired samples t-test
  • compare and contrast population metrics with sample metrics
  • construct hypothesis statements in the context of a statistical test
  • describe the bias-variance trade-off
  • estimate parameters of the population and interpret confidence intervals
  • examine and interpret the data for regression
  • examine and visualize data for regression
  • explore and pre-process data before model fitting
  • explore and visualize the relationships in data
  • find the optimal number of clusters using the elbow method and Silhouette score
  • fit and interpret the S-curve of logistic regression
  • fit a straight line on data to build a regression model and evaluate the model
  • implement the one-sample t-test and interpret results
  • interpret QQ plots for normally and non-normally distributed data
  • investigate and visualize data before fitting a model
  • outline the main characteristics of ensemble learning
  • perform regression using decision trees
  • perform regression using random forest
  • perform simple linear regression with a single predictor
  • perform the one-sample t-test and interpret results
  • perform the Wilcoxon signed-rank test
  • posit the null hypothesis and alternative hypothesis of a statistical test
  • recall characteristics of overfitted and underfitted models
  • recall implications of the p-value and significance level alpha
  • recall measures of central tendency and measures of dispersion
  • recall the assumptions made by the ANOVA test
  • recall the assumptions made by the one-sample t-test
  • recall the assumptions made by the two-sample t-test
  • recall the basic characteristics of machine learning models
  • recall the basic structure of decision tree models
  • recall the characteristics of discrete and continuous probability distributions
  • recall the key metrics to evaluate classifiers
  • recall the sets of statistical tools used to understand data
  • recall the techniques used to evaluate clustering models
  • sample and analyze data that follows a uniform distribution
  • summarize the differences and use cases for parametric and non-parametric models
  • train a model on an imbalanced dataset
  • train and evaluate a logistic regression model
  • use decision tree models for prediction

Overview/Description

Final Exam: Statistical Analysis and Modeling in R will test your knowledge and application of the topics presented throughout the Statistical Analysis and Modeling in R track of the Skillsoft Aspire Data Analysis with R Journey.



Target

Prerequisites: none

Statistical Analysis and Modeling in R: Building Regularized Models & Ensemble Models

Course Number:
it_dasamrdj_07_enus
Lesson Objectives

  • discover the key concepts covered in this course
  • recall characteristics of overfitted and underfitted models
  • describe the bias-variance trade-off
  • examine and interpret the data for regression
  • perform ordinary least squares (OLS) regression
  • prepare data to build regularized regression models
  • perform and evaluate Ridge regression
  • perform and evaluate Lasso regression
  • perform and evaluate ElasticNet regression
  • outline the main characteristics of ensemble learning
  • examine and visualize data for regression
  • perform regression using decision trees
  • perform regression using random forest
  • summarize the key concepts covered in this course

Overview/Description
Understanding the bias-variance trade-off allows data scientists to build generalizable models that perform well on test data. Machine learning models are considered a good fit if they can extract general patterns or dominant trends in the training data and use these to make predictions on unseen instances. Use this course to discover what it means for your model to be a good fit for the training data. Identify underfit and overfit models and what the bias-variance trade-off represents in machine learning. Mitigate overfitting on training data using regularized regression models, train and evaluate models built using ridge regression, lasso regression, and ElasticNet regression, and implement ensemble learning using the random forest model. When you're done with this course, you'll have the skills and knowledge to train models that learn general patterns using regularized models and ensemble learning.

Target

Prerequisites: none

Statistical Analysis and Modeling in R: Performing Classification

Course Number:
it_dasamrdj_05_enus
Lesson Objectives

  • discover the key concepts covered in this course
  • recall the key metrics to evaluate classifiers
  • fit and interpret the S-curve of logistic regression
  • train and evaluate a logistic regression model
  • train and evaluate a logistic model using all predictors
  • train a model on an imbalanced dataset
  • interpret the significance of coefficients, confidence intervals, and odds ratios
  • evaluate a model built using an imbalanced dataset
  • use resampling techniques to improve the model
  • recall the basic structure of decision tree models
  • explore and pre-process data before model fitting
  • use decision tree models for prediction
  • summarize the key concepts covered in this course

Overview/Description
Classification models are used to classify or categorize data points into two or more categories. Learn how these models work and how you can evaluate your classification models using the confusion matrix and metrics such as accuracy, precision, and recall. During this course, you'll perform classification using logistic regression, including on an imbalanced dataset. You'll also examine why precision or recall may be better metrics than accuracy for evaluating such models. Furthermore, build a classification model using decision trees, visualize the tree structure, and explore the variable importance assigned by this tree structure to understand and interpret the model. When you've finished this course, you'll be able to confidently use logistic regression and decision trees to build classification models and evaluate your models using accuracy, precision, and recall.
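
As a rough illustration of why accuracy can mislead on an imbalanced dataset, the following minimal sketch computes accuracy, precision, and recall from confusion-matrix counts. The course itself works in R; this sketch uses Python purely for illustration, and the function names and the 95/5 class split are hypothetical assumptions, not material from the course.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    return tp, fp, fn, tn


def classification_metrics(actual, predicted, positive=1):
    """Return accuracy, precision, and recall as a dictionary."""
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Convention assumed here: report 0.0 when the denominator is zero.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical imbalanced data: 95 negatives and 5 positives.
actual = [0] * 95 + [1] * 5
# A naive "model" that always predicts the majority class.
predicted = [0] * 100

metrics = classification_metrics(actual, predicted)
print(metrics)
```

The always-negative predictor scores high accuracy yet finds none of the positive cases, which is the situation where precision and recall are the more informative metrics.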

Target

Prerequisites: none

Statistical Analysis and Modeling in R: Performing Clustering

Course Number:
it_dasamrdj_06_enus
Lesson Objectives

  • discover the key concepts covered in this course
  • recall the techniques used to evaluate clustering models
  • investigate and visualize data before fitting a model
  • perform k-means clustering and interpret clustering results
  • find the optimal number of clusters using the elbow method and Silhouette score
  • perform k-means clustering on multi-attribute data
  • summarize the key concepts covered in this course

Overview/Description
Clustering is an unsupervised learning technique that discovers patterns in data on its own and helps identify logical groupings. Use this course to distinguish between supervised and unsupervised learning and recognize how regression and classification algorithms differ from clustering. Examine the basic principles of clustering models and how k-means clustering finds logical groupings in your data. Learn the evaluation techniques used in clustering and find the optimal number of clusters in your data using both the elbow method and the Silhouette score. Perform clustering on a dataset with multiple attributes and visualize clusters in your data using principal components. When you've completed this course, you'll be able to find groupings in your data using k-means clustering and compute the optimal number of clusters for your data.

Target

Prerequisites: none

Statistical Analysis and Modeling in R: Performing Regression Analysis

Course Number:
it_dasamrdj_04_enus
Lesson Objectives

  • discover the key concepts covered in this course
  • recall the basic characteristics of machine learning models
  • examine how to fit a straight line on data to build a regression model and evaluate the model
  • identify and visualize the relationships in data
  • perform simple linear regression with a single predictor
  • perform multiple regression using multiple predictors
  • apply the regression model to get predictions for test data
  • build a regression model using cross-validation
  • summarize the key concepts covered in this course

Overview/Description
Regression models are used to predict continuous values and are some of the most commonly used machine learning models. Use this course to grasp what exactly machine learning (ML) algorithms are and how you can use ML models to predict outcomes based on input data. Learn how regression models work, train them, and evaluate regression results using metrics such as R² and RMSE. Perform regression analysis in R using ordinary least squares regression. Build models using simple and multiple regression and train a regression model using cross-validation. Upon completing this course, you'll be able to perform regression to predict continuous values and evaluate these models using metrics such as R² and adjusted R².
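
To make the evaluation metrics concrete, here is a minimal sketch of simple linear regression by ordinary least squares with a single predictor, followed by the R-squared and RMSE calculations. The course itself works in R; this sketch uses Python purely for illustration, and the function names and the data values are hypothetical assumptions.

```python
import math


def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(y, y_hat):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat)) / len(y))


def r_squared(y, y_hat):
    """Share of the variance in y explained by the predictions."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data, roughly following y = 2x.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x, y)
predictions = [intercept + slope * xi for xi in x]
print("slope:", slope, "intercept:", intercept)
print("R-squared:", r_squared(y, predictions))
print("RMSE:", rmse(y, predictions))
```

An R-squared close to 1 and an RMSE that is small relative to the scale of y both indicate that the fitted line tracks the data closely.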

Target

Prerequisites: none

Statistical Analysis and Modeling in R: Statistical Analysis on Your Data

Course Number:
it_dasamrdj_03_enus
Lesson Objectives

  • discover the key concepts covered in this course
  • illustrate the assumptions made by the one-sample t-test
  • perform the one-sample t-test and interpret results
  • perform variations of the one-sample t-test, namely two-sided, greater, and less one-sample t-tests, and then interpret results
  • perform the one-sample Z-test and interpret results
  • illustrate the assumptions made by the two-sample t-test
  • run the two-sample t-test for equal variances
  • run Welch's two-sample t-test for unequal variances
  • perform the paired samples t-test
  • check the assumptions of the paired samples t-test for violation
  • perform the Wilcoxon signed-rank test
  • identify the assumptions made by the ANOVA test
  • run the one-way ANOVA test and the Tukey HSD test
  • run the two-way ANOVA test for additive and interaction models
  • summarize the differences and use cases for parametric and non-parametric models
  • summarize the key concepts covered in this course

Overview/Description
Hypothesis testing helps determine whether the educated guesses you've made about your data should be rejected or retained. T-tests and ANOVA tests are some of the most commonly used methods in hypothesis testing. Knowing how to perform and interpret these tests is a core data science skill. In this course, get hands-on practice running statistical tests on your sample data. Test the assumptions made by statistical tests, run t-tests, perform ANOVA, and interpret the results. Perform the one-sample t-test and the one-sample Z-test. Run the two-sample t-test to compare data from different groups or categories and the paired samples t-test to compare data across time. When you're finished with this course, you'll have the know-how to run and interpret statistical tests using the R programming language.
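
As one example of the tests named above, the following minimal sketch runs a two-sided one-sample Z-test and compares the p-value with a significance level. The course itself works in R; this sketch uses Python purely for illustration. It assumes the population standard deviation is known, and the function name, sample values, and alpha of 0.05 are hypothetical assumptions.

```python
import math
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma):
    """Two-sided one-sample Z-test.

    sample: observed values
    mu0:    population mean under the null hypothesis
    sigma:  population standard deviation, assumed known
    Returns the z statistic and the two-sided p-value.
    """
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical sample; null hypothesis: the population mean is 10.
sample = [11.2, 12.5, 10.9, 13.1, 12.0, 11.7, 12.8, 12.3]
alpha = 0.05  # assumed significance level

z, p_value = one_sample_z_test(sample, mu0=10, sigma=2)
print("z:", z, "p-value:", p_value)
if p_value < alpha:
    print("Reject the null hypothesis at alpha =", alpha)
else:
    print("Fail to reject the null hypothesis at alpha =", alpha)
```

When the population standard deviation is unknown and must be estimated from the sample, the one-sample t-test is the usual choice instead.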

Target

Prerequisites: none

Statistical Analysis and Modeling in R: Understanding & Interpreting Statistical Tests

Course Number:
it_dasamrdj_02_enus
Lesson Objectives

  • discover the key concepts covered in this course
  • recall measures of central tendency and measures of dispersion
  • estimate parameters of the population and interpret confidence intervals
  • construct hypothesis statements in the context of a statistical test
  • posit the null hypothesis and alternative hypothesis of a statistical test
  • recall implications of the p-value and significance level alpha
  • interpret p-values using significance level alpha
  • recognize the use of t-tests to compare the means of two groups
  • explore the ANOVA (analysis of variance) test to compare the means of two or more groups
  • summarize the key concepts covered in this course

Overview/Description
Statistical analysis involves making educated guesses known as hypotheses and testing them to see if they hold up. Use this course to learn how to apply hypothesis testing to your data. Examine the use of descriptive statistics to summarize data and inferential statistics to draw conclusions. Learn how population parameters differ from sample statistics and how confidence intervals are used. Discover how to perform hypothesis testing on sample data, construct null and alternative hypotheses, and interpret the results of your statistical tests. Investigate the significance of the p-value of a statistical test and how it can be interpreted using the significance threshold or alpha level. Additionally, examine the most commonly used statistical tests, the t-test and the analysis of variance (ANOVA). When you're done, you'll have the confidence to set up the null and alternative hypotheses for your tests and interpret the results.

Target

Prerequisites: none

Statistical Analysis and Modeling in R: Working with Probability Distributions

Course Number:
it_dasamrdj_01_enus
Lesson Objectives

  • discover the key concepts covered in this course
  • recall the sets of statistical tools used to understand data
  • compare and contrast population metrics with sample metrics
  • recall the characteristics of discrete and continuous probability distributions
  • sample and analyze data that follows a uniform distribution
  • sample and analyze data that follows a binomial distribution
  • calculate probabilities of events in the binomial distribution
  • sample and analyze data that follows a uniform distribution
  • examine and interpret normal distributions and exponential distributions
  • interpret QQ plots for normally and non-normally distributed data
  • use QQ plots to compare samples from different distributions
  • summarize the key concepts covered in this course

Overview/Description
Interpreting data is a core pre-processing step in data analysis and modeling. Use this course to practice using various dynamic statistical tools to explore and understand your data. During this course, you'll explore population distributions to model random variables, work with discrete and continuous probability distributions, and use discrete probability distribution types, such as the uniform, binomial, and Poisson distributions. You'll also examine continuous distributions, such as the normal and the exponential distributions. You'll round the course off by learning how to read and interpret QQ plots, which can be used to compare the distributions of two samples of data. When you're finished, you'll be able to use probability distributions to model events and understand your data.

Target

Prerequisites: none
